Search CORE

215 research outputs found

Use cases of lossy compression for floating-point data

Author: Cappello Franck
Publication venue: Barcelona Supercomputing Center
Publication date: 01/01/2019
Field of study

UPCommons. Portal del coneixement obert de la UPC

Significantly Improving Lossy Compression for Scientific Data Sets Based on Multidimensional Prediction and Error-Controlled Quantization

Author: Cappello Franck
Chen Zizhong
Di Sheng
Tao Dingwen
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 12/06/2017
Field of study

Today's HPC applications are producing extremely large amounts of data, such that data storage and analysis are becoming more challenging for scientific research. In this work, we design a new error-controlled lossy compression algorithm for large-scale scientific data. Our key contribution is significantly improving the prediction hitting rate (or prediction accuracy) for each data point based on its nearby data values along multiple dimensions. We derive a series of multilayer prediction formulas and their unified formula in the context of data compression. One serious challenge is that the data prediction has to be performed based on the preceding decompressed values during the compression in order to guarantee the error bounds, which may degrade the prediction accuracy in turn. We explore the best layer for the prediction by considering the impact of compression errors on the prediction accuracy. Moreover, we propose an adaptive error-controlled quantization encoder, which can further improve the prediction hitting rate considerably. The data size can be reduced significantly after performing the variable-length encoding because of the uneven distribution produced by our quantization encoder. We evaluate the new compressor on production scientific data sets and compare it with many other state-of-the-art compressors: GZIP, FPZIP, ZFP, SZ-1.1, and ISABELA. Experiments show that our compressor is the best in class, especially with regard to compression factors (or bit-rates) and compression errors (including RMSE, NRMSE, and PSNR). Our solution is better than the second-best solution by more than a 2x increase in the compression factor and 3.8x reduction in the normalized root mean squared error on average, with reasonable error bounds and user-desired bit-rates.Comment: Accepted by IPDPS'17, 11 pages, 10 figures, double colum

arXiv.org e-Print Archive

Crossref

BlobCR: Virtual Disk Based Checkpoint-Restart for HPC Applications on IaaS Clouds

Author: Cappello Franck
Nicolae Bogdan
Publication venue: 'Elsevier BV'
Publication date: 01/02/2013
Field of study

International audienceInfrastructure-as-a-Service (IaaS) cloud computing is gaining significant interest in industry and academia as an alternative platform for running HPC applications. Given the need to provide fault tolerance, support for suspend-resume and offline migration, an efficient Checkpoint-Restart mechanism becomes paramount in this context. We propose BlobCR, a dedicated checkpoint repository that is able to take live incremental snapshots of the whole disk attached to the virtual machine (VM) instances. BlobCR aims to minimize the performance overhead of checkpointing by persisting VM disk snapshots asynchronously in the background using a low overhead technique we call selective copy-on-write. It includes support for both application-level and process-level checkpointing, as well as support to roll back file system changes. Experiments at large scale demonstrate the benefits of our proposal both in synthetic settings and for a real-life HPC application

HAL-CentraleSupelec

HAL - Lille 3

INRIA a CCSD electronic archive server

Towards Efficient Live Migration of I/O Intensive Workloads: A Transparent Storage Transfer Proposal

Author: Cappello Franck
Nicolae Bogdan
Publication venue: HAL CCSD
Publication date: 01/10/2011
Field of study

Live migration of virtual machines (VMs) is key feature of virtualization that is extensively leveraged in IaaS cloud environments: it is the basic building block of several important features, such as load balancing, pro-active fault tolerance, power management, online maintenance, etc. While most live migration efforts concentrate on how to transfer the memory from source to destination during the migration process, comparatively little attention has been devoted to the transfer of storage. This problem is gaining increasing importance: due to performance reasons, virtual machines that run I/O intensive workloads tend to rely on local storage, which poses a difficult challenge on live migration: it needs to handle storage transfer in addition to memory transfer. This paper proposes a completely hypervisor-transparent approach that addresses this challenge. It relies on a hybrid active push-prioritized prefetch strategy, which makes it highly resilient to rapid changes of disk state exhibited by I/O intensive workloads. At the same time, transparency ensures a maximum of portability with a wide range of hypervisors. Large scale experiments that involve multiple simultaneous migrations of both synthetic benchmarks and a real scientiﬁc application show improvements of up to 10x faster migration time, 5x less bandwidth consumption and 62% less performance degradation over state-of-art

HAL-CentraleSupelec

HAL - Lille 3

INRIA a CCSD electronic archive server

HAL-Rennes 1

A Hybrid Local Storage Transfer Scheme for Live Migration of I/O Intensive Workloads

Author: Cappello Franck
Nicolae Bogdan
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 18/06/2012
Field of study

International audienceLive migration of virtual machines (VMs) is key feature of virtualization that is extensively leveraged in IaaS cloud environments: it is the basic building block of several important features, such as load balancing, pro-active fault tolerance, power management, online maintenance, etc. While most live migration efforts concentrate on how to transfer the memory from source to destination during the migration process, comparatively little attention has been devoted to the transfer of storage. This problem is gaining increasing importance: due to performance reasons, virtual machines that run large-scale, data-intensive applications tend to rely on local storage, which poses a difficult challenge on live migration: it needs to handle storage transfer in addition to memory transfer. This paper proposes a memory-migration independent approach that addresses this challenge. It relies on a hybrid active push / prioritized prefetch strategy, which makes it highly resilient to rapid changes of disk state exhibited by I/O intensive workloads. At the same time, it is minimally intrusive in order to ensure a maximum of portability with a wide range of hypervisors. Large scale experiments that involve multiple simultaneous migrations of both synthetic benchmarks and a real scientific application show improvements of up to 10x faster migration time, 10x less bandwidth consumption and 8x less performance degradation over state-of-art

HAL-CentraleSupelec

HAL - Lille 3

INRIA a CCSD electronic archive server

HAL-Rennes 1

BlobCR: Efficient Checkpoint-Restart for HPC Applications on IaaS Clouds using Virtual Disk Image Snapshots

Author: Cappello Franck
Nicolae Bogdan
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 12/11/2011
Field of study

International audienceInfrastructure-as-a-Service (IaaS) cloud computing is gaining significant interest in industry and academia as an alternative platform for running scientific applications. Given the dynamic nature of IaaS clouds and the long runtime and resource utilization of such applications, an efficient checkpoint-restart mechanism becomes paramount in this context. This paper proposes a solution to the aforementioned challenge that aims at minimizing the storage space performance overhead of checkpoint-restart. We introduce a framework that combines checkpoint-restart protocols at guest level with virtual machine (VM) disk-image multi-snapshotting and multi-deployment at host level in order to efficiently capture and potentially roll back the complete state of the application, including file system modifications. Experiments on the G5K testbed show substantial improvement for MPI applications over existing approaches, both for the case when customized checkpointing is available at application level and the case when it needs to be handled at process level

HAL-CentraleSupelec

HAL - Lille 3

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Rennes 1